# Multimodal Instruction Model

Phi 4 Mm Inst Asr Singlish

A multimodal speech recognition model optimized for Singapore English (Singlish). It is fine-tuned from Microsoft's Phi-4 multimodal instruction model and significantly improves recognition of Singapore English's distinctive phonetic features.

Tags: Transformers, Supports Multiple Languages

Typhoon2 Qwen2vl 7b Vision Instruct

Typhoon2-Vision is a vision-language model with Thai language support. It accepts image and video inputs and is specifically optimized for image-based applications.

Tags: Transformers, Supports Multiple Languages

Xgen Mm Phi3 Mini Instruct Singleimg R V1.5

xGen-MM is a series of the latest foundational large multimodal models developed by Salesforce AI Research. It builds on the successful design of the BLIP series and offers more powerful multimodal processing capabilities.

Tags: Safetensors, English

© 2025 AIbase